Critical Lock Analysis: Diagnosing Critical Section Bottlenecks in Multithreaded Applications on Multicore Systems
Abstract
Critical sections are well-known potential performance bottlenecks in multithreaded applications, and identifying the ones that inhibit scalability is important for performance optimization. Previous approaches use idle time as a key measure, but we show that this measure is not reliable: a thread idling on a lock does not necessarily mean that the corresponding critical section is on the critical path. We introduce critical lock analysis, a new method for diagnosing critical section bottlenecks in multithreaded applications. Our method first identifies the critical sections that appear on the critical path, and then quantifies their impact on overall performance using quantitative performance metrics. Case studies show that our method successfully identifies the critical sections whose optimization is most beneficial for overall performance and quantifies their impact on the critical path, which establishes the inherent critical section bottlenecks more reliably than previous approaches.
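The difference between idle time and critical-path time can be illustrated with a toy trace. The sketch below is not the authors' tool or algorithm; it is a minimal illustration under stated assumptions: every thread starts at time 0, lock blocking is the only inter-thread dependency, and the trace format (`Acq`) and the function names are hypothetical.

```python
from collections import defaultdict, namedtuple

# Hypothetical trace record: one lock acquisition by one thread.
# request = time the thread asked for the lock, acquire/release = hold interval.
Acq = namedtuple("Acq", "thread lock request acquire release")


def idle_times(acqs):
    """Total time threads spent waiting for each lock (the idle-time measure)."""
    idle = defaultdict(int)
    for a in acqs:
        idle[a.lock] += a.acquire - a.request
    return dict(idle)


def critical_path_lock_times(acqs, end_times):
    """Walk the critical path backward and sum lock hold time lying on it."""
    # Start from the thread that finishes last.
    thread = max(end_times, key=end_times.get)
    t = end_times[thread]
    on_path = defaultdict(int)
    while True:
        # Latest acquisition at or before t for which this thread had to wait.
        blocked = [a for a in acqs
                   if a.thread == thread and a.request < a.acquire <= t]
        last = max(blocked, key=lambda a: a.acquire, default=None)
        seg_start = last.acquire if last else 0
        # Hold time of this thread inside [seg_start, t] is on the path.
        for a in acqs:
            if a.thread == thread:
                overlap = min(a.release, t) - max(a.acquire, seg_start)
                if overlap > 0:
                    on_path[a.lock] += overlap
        if last is None:
            break  # reached the start of the program
        # Jump to the thread that released the lock we were waiting for.
        holder = next((b for b in acqs
                       if b.lock == last.lock and b.thread != thread
                       and b.release == last.acquire), None)
        if holder is None:
            break
        thread, t = holder.thread, holder.release
    return dict(on_path)


# Toy trace: lock A causes the most waiting, but the program ends on lock B.
trace = [
    Acq("T1", "A", 0, 0, 4),
    Acq("T2", "A", 0, 4, 5),
    Acq("T3", "B", 0, 0, 6),
    Acq("T4", "B", 5, 6, 10),
]
end_times = {"T1": 4, "T2": 5, "T3": 6, "T4": 10}

idle = idle_times(trace)
on_path = critical_path_lock_times(trace, end_times)
print("idle time per lock:", idle)
print("critical-path time per lock:", on_path)
```

In this trace, lock A accumulates more idle time than lock B (4 versus 1 time units), yet only lock B appears on the critical path, where it accounts for all 10 time units of the run. Shortening the critical section guarded by A would therefore not shorten the program.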
Similar resources
Remote Core Locking: Migrating Critical-Section Execution to Improve the Performance of Multithreaded Applications
The scalability of multithreaded applications on current multicore systems is hampered by the performance of lock algorithms, due to the costs of access contention and cache misses. In this paper, we propose a new lock algorithm, Remote Core Locking (RCL), that aims to improve the performance of critical sections in legacy applications on multicore architectures. The idea of RCL is to replace l...
Transactional Execution: Toward Reliable, High-Performance Multithreading
Explicit hardware support for multithreaded software, either in the form of shared-memory chip multiprocessors or hardware multithreaded architectures, is becoming increasingly common. As such support becomes available, application developers are expected to exploit these developments by employing multithreaded programming. ...
Efficient locking for multicore architectures
The scalability of multithreaded applications on current multicore systems is hampered by the performance of critical sections, due in particular to the costs of access contention and cache misses. In this paper, we propose a new locking technique, Remote Core Locking (RCL), that aims to improve the performance of critical sections in legacy applications on multicore architectures. The idea of R...
Thread Migration to Improve Synchronization Performance
A number of prior research efforts have investigated thread scheduling mechanisms to enable better reuse of data in a processor’s cache. We propose to exploit the locality of the critical section data by enforcing an affinity between locks and the processor that has cached the execution state of the critical section protected by that lock. We investigate the idea of migrating threads to the “lo...
Effectiveness of Compiler-Directed Prefetching on Data Mining Benchmarks
For today's increasingly power-constrained multicore systems, integrating simpler and more energy-efficient in-order cores becomes attractive. However, since in-order processors lack complex hardware support for tolerating long-latency memory accesses, developing compiler technologies to hide such latencies becomes critical. Compiler-directed prefetching has been demonstrated effective on some app...